Overview

Dataset statistics

Number of variables: 19
Number of observations: 16228
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 MiB
Average record size in memory: 132.0 B
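The two memory figures are tied together: the average record size is the total in-memory size divided by the number of observations. The short check below is a sketch that assumes the report rounds the total to one decimal place in MiB (1 MiB = 1024**2 bytes); the variable names are illustrative.

```python
# Figures copied from the Dataset statistics table.
n_observations = 16228
avg_record_bytes = 132.0  # reported average record size in memory

# Total size implied by the per-record figure, converted to MiB.
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 1024**2

print(f"{total_bytes:.0f} B is about {total_mib:.2f} MiB")
```

The implied total rounds to the reported 2.0 MiB, so the two figures agree.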

Variable types

Numeric: 12
Categorical: 7

Alerts

grade is highly correlated with bathrooms and 7 other fields (High correlation)
bathrooms is highly correlated with grade and 3 other fields (High correlation)
bedrooms is highly correlated with grade and 2 other fields (High correlation)
sqft_above is highly correlated with grade and 7 other fields (High correlation)
sqft_living15 is highly correlated with grade and 3 other fields (High correlation)
floors is highly correlated with grade and 4 other fields (High correlation)
sqft_lot is highly correlated with zipcode and 3 other fields (High correlation)
price is highly correlated with grade and 3 other fields (High correlation)
sqft_lot15 is highly correlated with zipcode and 3 other fields (High correlation)
sqft_living is highly correlated with grade and 6 other fields (High correlation)
antiguedad_venta is highly correlated with zipcode and 7 other fields (High correlation)
waterfront is highly correlated with view (High correlation)
view is highly correlated with waterfront (High correlation)
zipcode is highly correlated with sqft_lot and 2 other fields (High correlation)
sqft_basement is highly correlated with sqft_living (High correlation)
condition is highly correlated with antiguedad_venta (High correlation)
df_index has unique values (Unique)
sqft_basement has 10173 (62.7%) zeros (Zeros)
yr_renovated has 15641 (96.4%) zeros (Zeros)
antiguedad_venta has 302 (1.9%) zeros (Zeros)
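The percentages in the Zeros alerts are the zero counts divided by the number of observations. The sketch below recomputes them from the counts listed in the alerts; the helper name `zero_share` is illustrative, and no claim is made about the threshold the profiler uses to raise the alert.

```python
N_OBSERVATIONS = 16228

# Zero counts copied from the Zeros alerts.
zero_counts = {
    "sqft_basement": 10173,
    "yr_renovated": 15641,
    "antiguedad_venta": 302,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the percentage of observations equal to zero, to one decimal."""
    return round(100 * count / total, 1)


zero_pct = {name: zero_share(count) for name, count in zero_counts.items()}
for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```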

Reproduction

Analysis started: 2022-10-02 04:29:24.740179
Analysis finished: 2022-10-02 04:30:03.329709
Duration: 38.59 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
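The reported duration is the difference between the two timestamps. A minimal check with the standard library, using the start and finish values exactly as printed in the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2022-10-02 04:29:24.740179")
finished = datetime.fromisoformat("2022-10-02 04:30:03.329709")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```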

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 16228
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18813.10118
Minimum: 1
Maximum: 113866
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1133.35
Q1: 6227.75
Median: 14532.5
Q3: 27261.5
95-th percentile: 51233.85
Maximum: 113866
Range: 113865
Interquartile range (IQR): 21033.75

Descriptive statistics

Standard deviation: 16137.5866
Coefficient of variation (CV): 0.8577845002
Kurtosis: 1.640962296
Mean: 18813.10118
Median Absolute Deviation (MAD): 9668
Skewness: 1.266324341
Sum: 305299006
Variance: 260421701.1
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
19857                  1       < 0.1%
4771                   1       < 0.1%
3170                   1       < 0.1%
27214                  1       < 0.1%
45184                  1       < 0.1%
13373                  1       < 0.1%
21447                  1       < 0.1%
11241                  1       < 0.1%
68532                  1       < 0.1%
27533                  1       < 0.1%
Other values (16218)   16218   99.9%

Smallest values

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
5       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%
12      1       < 0.1%
14      1       < 0.1%
15      1       < 0.1%

Largest values

Value    Count   Frequency (%)
113866   1       < 0.1%
111906   1       < 0.1%
109571   1       < 0.1%
108311   1       < 0.1%
99297    1       < 0.1%
98195    1       < 0.1%
98015    1       < 0.1%
94768    1       < 0.1%
94083    1       < 0.1%
93739    1       < 0.1%
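Several of the statistics reported for df_index are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed in the report, so small rounding differences are expected.

```python
# Values copied from the df_index statistics.
minimum, maximum = 1, 113866
q1, q3 = 6227.75, 27261.5
mean, std = 18813.10118, 16137.5866

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 6), round(variance, 1))
```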

zipcode
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 70
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98078.55928
Minimum: 98001
Maximum: 98199
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.5 KiB

Quantile statistics

Minimum: 98001
5-th percentile: 98005
Q1: 98033
Median: 98065
Q3: 98118
95-th percentile: 98177
Maximum: 98199
Range: 198
Interquartile range (IQR): 85

Descriptive statistics

Standard deviation: 53.24375101
Coefficient of variation (CV): 0.0005428684047
Kurtosis: -0.8597956826
Mean: 98078.55928
Median Absolute Deviation (MAD): 42
Skewness: 0.3980821458
Sum: 1591618860
Variance: 2834.897022
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
98038               469     2.9%
98103               468     2.9%
98115               462     2.8%
98052               455     2.8%
98042               445     2.7%
98117               433     2.7%
98034               429     2.6%
98023               399     2.5%
98133               397     2.4%
98118               393     2.4%
Other values (60)   11878   73.2%

Smallest values

Value   Count   Frequency (%)
98001   294     1.8%
98002   155     1.0%
98003   212     1.3%
98004   126     0.8%
98005   122     0.8%
98006   322     2.0%
98007   107     0.7%
98008   213     1.3%
98010   90      0.6%
98011   150     0.9%

Largest values

Value   Count   Frequency (%)
98199   212     1.3%
98198   221     1.4%
98188   105     0.6%
98178   209     1.3%
98177   179     1.1%
98168   218     1.3%
98166   193     1.2%
98155   364     2.2%
98148   51      0.3%
98146   219     1.3%
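The "Other values" row of a common-values table is the remainder after the ten most frequent values are counted. As a sketch, the zipcode figures can be reproduced from the top-10 counts and the number of observations; the variable names are illustrative.

```python
n_observations = 16228

# Ten most frequent zipcode values and their counts, copied from the report.
top10 = {
    98038: 469, 98103: 468, 98115: 462, 98052: 455, 98042: 445,
    98117: 433, 98034: 429, 98023: 399, 98133: 397, 98118: 393,
}

other_count = n_observations - sum(top10.values())
other_pct = round(100 * other_count / n_observations, 1)
print(f"Other values: {other_count} ({other_pct}%)")
```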

grade
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.516144935
Minimum: 3
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.5 KiB

Quantile statistics

Minimum: 3
5-th percentile: 6
Q1: 7
Median: 7
Q3: 8
95-th percentile: 9
Maximum: 12
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.02266091
Coefficient of variation (CV): 0.1360618933
Kurtosis: 0.7856244589
Mean: 7.516144935
Median Absolute Deviation (MAD): 1
Skewness: 0.4890425822
Sum: 121972
Variance: 1.045835338
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value   Count   Frequency (%)
7       7176    44.2%
8       4737    29.2%
9       1822    11.2%
6       1622    10.0%
10      557     3.4%
5       188     1.2%
11      97      0.6%
4       23      0.1%
3       3       < 0.1%
12      3       < 0.1%

Smallest and largest values: grade has only 10 distinct values, all of which are listed above.
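Because grade takes only ten distinct values, its sum, mean, and variance can be recovered from the frequency table alone. The sketch below does that with the counts from the report; dividing by n - 1 (the sample variance) is an assumption here, chosen because it reproduces the reported figure.

```python
# Value -> count, copied from the grade frequency table.
grade_counts = {3: 3, 4: 23, 5: 188, 6: 1622, 7: 7176,
                8: 4737, 9: 1822, 10: 557, 11: 97, 12: 3}

n = sum(grade_counts.values())
total = sum(value * count for value, count in grade_counts.items())
mean = total / n

# Sample variance (n - 1 denominator) computed from the grouped data.
variance = sum(count * (value - mean) ** 2
               for value, count in grade_counts.items()) / (n - 1)

print(n, total, round(mean, 6), round(variance, 6))
```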

sqft_basement
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 242
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 258.5320434
Minimum: 0
Maximum: 2720
Zeros: 10173
Zeros (%): 62.7%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 500
95-th percentile: 1080
Maximum: 2720
Range: 2720
Interquartile range (IQR): 500

Descriptive statistics

Standard deviation: 399.2837658
Coefficient of variation (CV): 1.544426604
Kurtosis: 1.321276483
Mean: 258.5320434
Median Absolute Deviation (MAD): 0
Skewness: 1.445927685
Sum: 4195458
Variance: 159427.5256
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
0                    10173   62.7%
600                  178     1.1%
500                  172     1.1%
700                  154     0.9%
800                  148     0.9%
400                  138     0.9%
300                  106     0.7%
900                  105     0.6%
1000                 105     0.6%
480                  89      0.5%
Other values (232)   4860    29.9%

Smallest values

Value   Count   Frequency (%)
0       10173   62.7%
10      2       < 0.1%
20      1       < 0.1%
40      4       < 0.1%
50      7       < 0.1%
60      9       0.1%
65      1       < 0.1%
70      5       < 0.1%
80      13      0.1%
90      17      0.1%

Largest values

Value   Count   Frequency (%)
2720    1       < 0.1%
2600    1       < 0.1%
2300    1       < 0.1%
2250    1       < 0.1%
2170    1       < 0.1%
2160    1       < 0.1%
2150    1       < 0.1%
2110    1       < 0.1%
2090    1       < 0.1%
2080    1       < 0.1%

view
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 919.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 16228
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       15011   92.5%
2       621     3.8%
3       263     1.6%
1       231     1.4%
4       102     0.6%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Each value is a single character, so the character counts are the same as the value counts under Common Values.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   16228   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   16228   100.0%

bathrooms
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 951.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 48684
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 1.0
3rd row: 2.0
4th row: 1.0
5th row: 2.0

Common Values

Value   Count   Frequency (%)
2.0     8162    50.3%
1.0     6715    41.4%
3.0     1265    7.8%
4.0     86      0.5%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Value   Count   Frequency (%)
.       16228   33.3%
0       16228   33.3%
2       8162    16.8%
1       6715    13.8%
3       1265    2.6%
4       86      0.2%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      32456   66.7%
Other Punctuation   16228   33.3%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       16228   50.0%
2       8162    25.1%
1       6715    20.7%
3       1265    3.9%
4       86      0.3%

Other Punctuation
Value   Count   Frequency (%)
.       16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   48684   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   48684   100.0%

All characters belong to the single script (Common) and block (ASCII), so the per-script and per-block character tables match the Most occurring characters table.
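The character and category tables for bathrooms follow directly from the value counts: every value such as "2.0" contributes its count once per character, and each character is classified by its Unicode general category ("Decimal Number" is `Nd`, "Other Punctuation" is `Po`). The sketch below reproduces the totals with the standard library; it illustrates the idea and is not the profiler's own implementation.

```python
from collections import Counter
import unicodedata

# Value -> count, copied from the bathrooms Common Values table.
bathrooms_counts = {"2.0": 8162, "1.0": 6715, "3.0": 1265, "4.0": 86}

char_counts = Counter()
category_counts = Counter()
for value, count in bathrooms_counts.items():
    for ch in value:
        char_counts[ch] += count
        # unicodedata.category returns the two-letter general category code.
        category_counts[unicodedata.category(ch)] += count

total_characters = sum(char_counts.values())
print(total_characters, dict(char_counts), dict(category_counts))
```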

bedrooms
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 951.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 48684
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3.0
2nd row: 3.0
3rd row: 4.0
4th row: 5.0
5th row: 3.0

Common Values

Value   Count   Frequency (%)
3.0     7708    47.5%
4.0     5078    31.3%
2.0     2205    13.6%
5.0     1072    6.6%
1.0     165     1.0%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Value   Count   Frequency (%)
.       16228   33.3%
0       16228   33.3%
3       7708    15.8%
4       5078    10.4%
2       2205    4.5%
5       1072    2.2%
1       165     0.3%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      32456   66.7%
Other Punctuation   16228   33.3%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       16228   50.0%
3       7708    23.7%
4       5078    15.6%
2       2205    6.8%
5       1072    3.3%
1       165     0.5%

Other Punctuation
Value   Count   Frequency (%)
.       16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   48684   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   48684   100.0%

All characters belong to the single script (Common) and block (ASCII), so the per-script and per-block character tables match the Most occurring characters table.

sqft_above
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 747
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1696.06002
Minimum: 380
Maximum: 5710
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 380
5-th percentile: 840
Q1: 1170
Median: 1510
Q3: 2090
95-th percentile: 3130
Maximum: 5710
Range: 5330
Interquartile range (IQR): 920

Descriptive statistics

Standard deviation: 714.9626592
Coefficient of variation (CV): 0.4215432537
Kurtosis: 0.9715370441
Mean: 1696.06002
Median Absolute Deviation (MAD): 410
Skewness: 1.075612747
Sum: 27523662
Variance: 511171.6041
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1200                 164     1.0%
1300                 159     1.0%
1010                 157     1.0%
1400                 152     0.9%
1220                 148     0.9%
1340                 148     0.9%
1180                 146     0.9%
1140                 144     0.9%
1060                 143     0.9%
1100                 138     0.9%
Other values (737)   14729   90.8%

Smallest values

Value   Count   Frequency (%)
380     1       < 0.1%
390     1       < 0.1%
420     2       < 0.1%
430     1       < 0.1%
440     1       < 0.1%
470     2       < 0.1%
480     4       < 0.1%
490     2       < 0.1%
500     2       < 0.1%
520     6       < 0.1%

Largest values

Value   Count   Frequency (%)
5710    1       < 0.1%
5480    1       < 0.1%
5450    1       < 0.1%
5320    1       < 0.1%
5250    1       < 0.1%
5190    1       < 0.1%
5070    1       < 0.1%
4930    1       < 0.1%
4850    1       < 0.1%
4750    2       < 0.1%

sqft_living15
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 662
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1914.327089
Minimum: 399
Maximum: 5380
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 399
5-th percentile: 1120
Q1: 1470
Median: 1790
Q3: 2270
95-th percentile: 3080
Maximum: 5380
Range: 4981
Interquartile range (IQR): 800

Descriptive statistics

Standard deviation: 608.6824722
Coefficient of variation (CV): 0.3179615833
Kurtosis: 0.7684819671
Mean: 1914.327089
Median Absolute Deviation (MAD): 380
Skewness: 0.8938470952
Sum: 31065700
Variance: 370494.352
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1440                 158     1.0%
1560                 155     1.0%
1540                 154     0.9%
1500                 148     0.9%
1460                 144     0.9%
1580                 140     0.9%
1720                 137     0.8%
1480                 133     0.8%
1620                 133     0.8%
1520                 133     0.8%
Other values (652)   14793   91.2%

Smallest values

Value   Count   Frequency (%)
399     1       < 0.1%
460     1       < 0.1%
620     2       < 0.1%
670     1       < 0.1%
690     2       < 0.1%
700     2       < 0.1%
710     1       < 0.1%
720     2       < 0.1%
740     5       < 0.1%
750     2       < 0.1%

Largest values

Value   Count   Frequency (%)
5380    1       < 0.1%
4950    1       < 0.1%
4920    1       < 0.1%
4640    1       < 0.1%
4600    1       < 0.1%
4590    1       < 0.1%
4530    1       < 0.1%
4510    1       < 0.1%
4495    1       < 0.1%
4490    1       < 0.1%

waterfront
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 919.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 16228
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       16181   99.7%
1       47      0.3%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Each value is a single character, so the character counts are the same as the value counts under Common Values.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   16228   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   16228   100.0%

floors
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 951.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 48684
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 1.0
3rd row: 2.0
4th row: 1.0
5th row: 1.0

Common Values

Value   Count   Frequency (%)
1.0     9789    60.3%
2.0     5974    36.8%
3.0     465     2.9%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Value   Count   Frequency (%)
.       16228   33.3%
0       16228   33.3%
1       9789    20.1%
2       5974    12.3%
3       465     1.0%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      32456   66.7%
Other Punctuation   16228   33.3%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       16228   50.0%
1       9789    30.2%
2       5974    18.4%
3       465     1.4%

Other Punctuation
Value   Count   Frequency (%)
.       16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   48684   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   48684   100.0%

All characters belong to the single script (Common) and block (ASCII), so the per-script and per-block character tables match the Most occurring characters table.

yr_renovated
Real number (ℝ≥0)

ZEROS

Distinct: 70
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.17130885
Minimum: 0
Maximum: 2015
Zeros: 15641
Zeros (%): 96.4%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 2015
Range: 2015
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 372.5691967
Coefficient of variation (CV): 5.162289595
Kurtosis: 22.6983639
Mean: 72.17130885
Median Absolute Deviation (MAD): 0
Skewness: 4.969257204
Sum: 1171196
Variance: 138807.8064
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
0                   15641   96.4%
2014                68      0.4%
2013                27      0.2%
2000                24      0.1%
2007                22      0.1%
2005                19      0.1%
2003                18      0.1%
1990                17      0.1%
2006                17      0.1%
2009                16      0.1%
Other values (60)   359     2.2%

Smallest values

Value   Count   Frequency (%)
0       15641   96.4%
1934    1       < 0.1%
1940    2       < 0.1%
1944    1       < 0.1%
1945    2       < 0.1%
1946    2       < 0.1%
1948    1       < 0.1%
1950    2       < 0.1%
1951    1       < 0.1%
1953    3       < 0.1%

Largest values

Value   Count   Frequency (%)
2015    12      0.1%
2014    68      0.4%
2013    27      0.2%
2012    9       0.1%
2011    8       < 0.1%
2010    8       < 0.1%
2009    16      0.1%
2008    10      0.1%
2007    22      0.1%
2006    17      0.1%
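With 96.4% zeros, the overall mean of yr_renovated (about 72) is not a year at all. Assuming that 0 is a placeholder meaning "no renovation recorded" (the report does not say so), the mean over the non-zero entries can be recovered from the reported sum and zero count, as in this sketch:

```python
# Figures copied from the yr_renovated statistics.
n_observations = 16228
n_zeros = 15641
total = 1171196  # reported Sum; zeros contribute nothing to it

n_renovated = n_observations - n_zeros   # rows with a non-zero year
mean_all = total / n_observations        # reproduces the reported mean
mean_nonzero = total / n_renovated       # mean year among non-zero rows

print(n_renovated, round(mean_all, 2), round(mean_nonzero, 1))
```

Under that assumption, a yes/no "renovated" flag or the non-zero years on their own would describe the column better than the raw mean.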

sqft_lot
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6401
Distinct (%): 39.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7151.695075
Minimum: 520
Maximum: 17622
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 520
5-th percentile: 1688.35
Q1: 5000
Median: 7151.695075
Q3: 8800
95-th percentile: 13178.4
Maximum: 17622
Range: 17102
Interquartile range (IQR): 3800

Descriptive statistics

Standard deviation: 3195.496209
Coefficient of variation (CV): 0.4468166184
Kurtosis: 0.4870625894
Mean: 7151.695075
Median Absolute Deviation (MAD): 1993
Skewness: 0.4849176236
Sum: 116057707.7
Variance: 10211196.02
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
7151.695075           1772    10.9%
5000                  270     1.7%
6000                  214     1.3%
4000                  191     1.2%
7200                  162     1.0%
4800                  93      0.6%
9600                  90      0.6%
4500                  90      0.6%
7500                  89      0.5%
8400                  83      0.5%
Other values (6391)   13174   81.2%

Smallest values

Value   Count   Frequency (%)
520     1       < 0.1%
600     1       < 0.1%
635     1       < 0.1%
638     1       < 0.1%
649     2       < 0.1%
651     1       < 0.1%
676     1       < 0.1%
681     1       < 0.1%
683     1       < 0.1%
690     2       < 0.1%

Largest values

Value   Count   Frequency (%)
17622   1       < 0.1%
17600   3       < 0.1%
17585   1       < 0.1%
17583   1       < 0.1%
17577   1       < 0.1%
17550   1       < 0.1%
17541   1       < 0.1%
17532   1       < 0.1%
17500   1       < 0.1%
17487   1       < 0.1%
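The most frequent sqft_lot value, 7151.695075, is a non-integer that equals both the reported mean and the reported median and occurs 1772 times. That pattern is consistent with some values having been replaced by the column mean before profiling, although the report does not state how the data were prepared. The toy sketch below, with hypothetical lot sizes and a hypothetical helper `impute_with_mean`, shows why mean imputation leaves the mean unchanged and can pull the median onto it.

```python
from statistics import mean, median


def impute_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical lot sizes; None marks entries to be imputed.
raw = [4000, 5000, None, 6000, None, 7200, None, 9600]
filled = impute_with_mean(raw)

print(mean(filled), median(filled), filled.count(mean(filled)))
```

In the toy data the imputed value becomes the mode, the mean, and the median at once, which mirrors what the sqft_lot tables show.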

price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3033
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 471191.77
Minimum: 75000
Maximum: 1130000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 75000
5-th percentile: 209640
Q1: 314000
Median: 432500
Q3: 597500
95-th percentile: 865000
Maximum: 1130000
Range: 1055000
Interquartile range (IQR): 283500

Descriptive statistics

Standard deviation: 202745.1953
Coefficient of variation (CV): 0.4302816989
Kurtosis: -0.06389500165
Mean: 471191.77
Median Absolute Deviation (MAD): 133500
Skewness: 0.7198897063
Sum: 7646500043
Variance: 4.110561421 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
350000                141     0.9%
450000                135     0.8%
425000                131     0.8%
500000                128     0.8%
550000                128     0.8%
325000                116     0.7%
375000                116     0.7%
400000                110     0.7%
250000                106     0.7%
300000                105     0.6%
Other values (3023)   15012   92.5%

Smallest values

Value   Count   Frequency (%)
75000   1       < 0.1%
78000   1       < 0.1%
80000   1       < 0.1%
81000   1       < 0.1%
82000   1       < 0.1%
82500   1       < 0.1%
83000   1       < 0.1%
84000   1       < 0.1%
85000   2       < 0.1%
89000   1       < 0.1%

Largest values

Value     Count   Frequency (%)
1130000   4       < 0.1%
1122500   1       < 0.1%
1120280   1       < 0.1%
1120000   4       < 0.1%
1115500   1       < 0.1%
1112750   1       < 0.1%
1110000   6       < 0.1%
1103990   1       < 0.1%
1102030   1       < 0.1%
1100000   24      0.1%
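"Fixed size bins (bins=50)" means the histogram splits the value range into 50 intervals of equal width. The sketch below works out the bin width for price and the bin a given price falls into, assuming the bins span exactly from the reported minimum to the reported maximum and that the maximum is counted in the last bin; `bin_index` is an illustrative helper, not part of the profiler.

```python
# Figures copied from the price statistics.
minimum, maximum, bins = 75000, 1130000, 50

bin_width = (maximum - minimum) / bins


def bin_index(value: float) -> int:
    """Return the 0-based index of the equal-width bin containing value."""
    # The maximum sits on the right edge, so clamp it into the last bin.
    return min(int((value - minimum) // bin_width), bins - 1)


print(bin_width, bin_index(432500), bin_index(maximum))
```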

condition
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 919.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 16228
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value   Count   Frequency (%)
3       10574   65.2%
4       4279    26.4%
5       1217    7.5%
2       134     0.8%
1       24      0.1%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Each value is a single character, so the character counts are the same as the value counts under Common Values.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   16228   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   16228   100.0%

sqft_lot15
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6011
Distinct (%): 37.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7286.342928
Minimum: 659
Maximum: 20000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 659
5-th percentile: 1916
Q1: 5039.5
Median: 7286.342928
Q3: 8869
95-th percentile: 13129
Maximum: 20000
Range: 19341
Interquartile range (IQR): 3829.5

Descriptive statistics

Standard deviation: 3231.976368
Coefficient of variation (CV): 0.4435663267
Kurtosis: 1.3278936
Mean: 7286.342928
Median Absolute Deviation (MAD): 2013.657072
Skewness: 0.7161463828
Sum: 118242773
Variance: 10445671.24
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
7286.342928           1289    7.9%
5000                  325     2.0%
4000                  275     1.7%
6000                  222     1.4%
7200                  161     1.0%
7500                  108     0.7%
4800                  103     0.6%
4500                  92      0.6%
8400                  85      0.5%
3600                  84      0.5%
Other values (6001)   13484   83.1%

Smallest values

Value   Count   Frequency (%)
659     1       < 0.1%
660     1       < 0.1%
748     1       < 0.1%
750     3       < 0.1%
755     1       < 0.1%
758     1       < 0.1%
794     1       < 0.1%
810     2       < 0.1%
886     3       < 0.1%
887     1       < 0.1%

Largest values

Value   Count   Frequency (%)
20000   11      0.1%
19998   2       < 0.1%
19965   1       < 0.1%
19961   1       < 0.1%
19939   1       < 0.1%
19916   1       < 0.1%
19908   1       < 0.1%
19878   1       < 0.1%
19868   2       < 0.1%
19856   1       < 0.1%

sqft_living
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 797
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1954.592063
Minimum: 380
Maximum: 7350
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 126.9 KiB

Quantile statistics

Minimum: 380
5-th percentile: 920
Q1: 1390
Median: 1840
Q3: 2410
95-th percentile: 3350
Maximum: 7350
Range: 6970
Interquartile range (IQR): 1020

Descriptive statistics

Standard deviation: 756.0729845
Coefficient of variation (CV): 0.3868188144
Kurtosis: 0.9271512852
Mean: 1954.592063
Median Absolute Deviation (MAD): 500
Skewness: 0.809911147
Sum: 31719120
Variance: 571646.3579
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1440                 112     0.7%
1400                 110     0.7%
1300                 107     0.7%
1480                 103     0.6%
1540                 103     0.6%
1010                 100     0.6%
1660                 100     0.6%
1720                 100     0.6%
1820                 99      0.6%
1560                 99      0.6%
Other values (787)   15195   93.6%

Smallest values

Value   Count   Frequency (%)
380     1       < 0.1%
390     1       < 0.1%
420     2       < 0.1%
430     1       < 0.1%
440     1       < 0.1%
470     2       < 0.1%
480     2       < 0.1%
490     1       < 0.1%
500     1       < 0.1%
520     6       < 0.1%

Largest values

Value   Count   Frequency (%)
7350    1       < 0.1%
7120    1       < 0.1%
6050    1       < 0.1%
5820    1       < 0.1%
5774    1       < 0.1%
5710    1       < 0.1%
5660    1       < 0.1%
5635    1       < 0.1%
5610    1       < 0.1%
5470    1       < 0.1%
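The reported column sums suggest why sqft_living is flagged as highly correlated with sqft_above and sqft_basement: the sums of the latter two add up exactly to the sum of sqft_living. Matching totals are consistent with sqft_living being the row-wise sum of the other two columns, but they do not prove it, since the report contains no row-level data. A quick check of the totals:

```python
n_observations = 16228

# Sums copied from the Descriptive statistics of each variable.
sum_above = 27523662
sum_basement = 4195458
sum_living = 31719120

combined = sum_above + sum_basement
totals_match = combined == sum_living
combined_mean = combined / n_observations

print(totals_match, round(combined_mean, 6))
```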

yr_date
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 998.5 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 97368
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015.0
2nd row: 2015.0
3rd row: 2014.0
4th row: 2015.0
5th row: 2015.0

Common Values

Value    Count   Frequency (%)
2014.0   11023   67.9%
2015.0   5205    32.1%

Plots: histogram of lengths of the category; category frequency plot.

Most occurring characters

Value   Count   Frequency (%)
0       32456   33.3%
2       16228   16.7%
1       16228   16.7%
.       16228   16.7%
4       11023   11.3%
5       5205    5.3%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      81140   83.3%
Other Punctuation   16228   16.7%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       32456   40.0%
2       16228   20.0%
1       16228   20.0%
4       11023   13.6%
5       5205    6.4%

Other Punctuation
Value   Count   Frequency (%)
.       16228   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   97368   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   97368   100.0%

All characters belong to the single script (Common) and block (ASCII), so the per-script and per-block character tables match the Most occurring characters table.

antiguedad_venta
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 117
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.41767316
Minimum: -1
Maximum: 115
Zeros: 302
Zeros (%): 1.9%
Negative: 12
Negative (%): 0.1%
Memory size: 126.9 KiB

Quantile statistics

Minimum: -1
5-th percentile: 4
Q1: 18
Median: 40
Q3: 63
95-th percentile: 99
Maximum: 115
Range: 116
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 29.10348042
Coefficient of variation (CV): 0.670314144
Kurtosis: -0.643806599
Mean: 43.41767316
Median Absolute Deviation (MAD): 22
Skewness: 0.4593440295
Sum: 704582
Variance: 847.0125723
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
11                   343     2.1%
9                    343     2.1%
10                   326     2.0%
8                    325     2.0%
37                   310     1.9%
0                    302     1.9%
36                   295     1.8%
7                    290     1.8%
46                   268     1.7%
47                   268     1.7%
Other values (107)   13158   81.1%

Smallest values

Value   Count   Frequency (%)
-1      12      0.1%
0       302     1.9%
1       203     1.3%
2       127     0.8%
3       117     0.7%
4       100     0.6%
5       148     0.9%
6       237     1.5%
7       290     1.8%
8       325     2.0%

Largest values

Value   Count   Frequency (%)
115     21      0.1%
114     47      0.3%
113     20      0.1%
112     25      0.2%
111     38      0.2%
110     39      0.2%
109     43      0.3%
108     61      0.4%
107     65      0.4%
106     52      0.3%
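The name antiguedad_venta is Spanish for roughly "age at sale". The report does not define the column, so the sketch below rests on an assumption: that it is the sale year (yr_date) minus the construction year, where `year_built` is a hypothetical input that does not appear among the profiled variables. Under that reading, the 12 negative values would be houses sold the year before their recorded construction year.

```python
def age_at_sale(sale_year: int, year_built: int) -> int:
    """Age of the house at sale, in years (assumed definition)."""
    return sale_year - year_built


# Hypothetical examples matching the reported extremes.
print(age_at_sale(2014, 2015))  # sold before the recorded construction year
print(age_at_sale(2014, 2014))  # sold in the construction year
print(age_at_sale(2015, 1900))  # oldest age compatible with the maximum
```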

Interactions

[Figures: 144 pairwise interaction plots (Matplotlib v3.5.3); images not preserved in this text version.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
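
As a rough illustration of that recipe, the sketch below ranks both variables (ties receive the average of their positions) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` and the toy values are assumptions made for this example, not the report software's implementation.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations (1/n factors cancel)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A nonlinear but strictly increasing relationship (toy values).
xs = [1, 2, 3, 4]
ys = [1, 8, 27, 64]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this toy input the rank-based coefficient treats the cubic relationship as perfectly monotonic, while the plain Pearson coefficient on the raw values is lower.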

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
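
The calculation can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy values are illustrative assumptions, not part of the report; the second call shows the location and scale invariance mentioned above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Toy values (hypothetical, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(xs, ys)
# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_shifted = pearson_r(xs, [10.0 * y + 3.0 for y in ys])
print(r, r_shifted)
```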

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
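
The pair-counting definition can be sketched directly. This is the simple form described above (often called tau-a); library implementations may apply a tie correction, so their output can differ when the data contain ties. The function name `kendall_tau_a` and the toy inputs are assumptions for this example.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # The sign of the product tells whether both variables move the same way.
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie in at least one variable: neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)

print(kendall_tau_a([1, 2, 3], [1, 2, 3]))  # every pair concordant
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))  # every pair discordant
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # two concordant pairs, one discordant
```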

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
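
A minimal sketch of a bias-corrected Cramér's V, following the commonly cited form of Bergsma's correction, is given below. It takes a contingency table as a list of rows and uses the plain Pearson chi-squared statistic without a continuity correction; whether the report software applies exactly these steps is an assumption. The function name `cramers_v_corrected` and the example tables are hypothetical.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic (no continuity correction in this sketch).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent categories
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated categories
```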

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

row    df_index  zipcode  grade  sqft_basement  view  bathrooms  bedrooms  sqft_above  sqft_living15  waterfront  floors  yr_renovated      sqft_lot     price  condition  sqft_lot15  sqft_living  yr_date  antiguedad_venta
0         19857    98006     10            0.0     0        2.0       3.0      2610.0         3140.0           0     2.0           0.0   8481.000000  810000.0          3     10008.0       2610.0   2015.0              22.0
1         14014    98033      8          650.0     1        1.0       3.0      1560.0         2210.0           0     1.0           0.0   8955.000000  685000.0          3      8976.0       2210.0   2015.0              41.0
2         32909    98005      8            0.0     0        2.0       4.0      2650.0         2230.0           0     2.0           0.0   7151.695075  725000.0          3     19856.0       2650.0   2014.0              28.0
3         16305    98001      7          900.0     0        1.0       5.0      1050.0         1660.0           0     1.0           0.0   8720.000000  274000.0          3      8030.0       1950.0   2015.0              53.0
4          6647    98011      7          320.0     0        2.0       3.0      1310.0         1620.0           0     1.0           0.0   6449.000000  445000.0          3      7429.0       1630.0   2015.0              29.0
5          5865    98040      8          850.0     0        2.0       4.0      1760.0         2550.0           0     1.0           0.0   8760.000000  762500.0          4     10376.0       2610.0   2014.0              36.0
6          8009    98004      8            0.0     1        1.0       3.0      1700.0         2630.0           0     1.0           0.0  14133.000000  979000.0          4     17376.0       1700.0   2014.0              60.0
7          4731    98011      8          780.0     0        3.0       5.0      2090.0         2640.0           0     2.0           0.0   4369.000000  540000.0          3      4610.0       2870.0   2014.0               7.0
8         38480    98052      9            0.0     0        2.0       4.0      2700.0         2730.0           0     2.0           0.0   8810.000000  690000.0          3      5100.0       2700.0   2014.0              10.0
9         13246    98072      7          530.0     0        1.0       3.0      1130.0         1260.0           0     1.0           0.0   9673.000000  375000.0          3      9681.0       1660.0   2014.0              38.0

Last rows

row    df_index  zipcode  grade  sqft_basement  view  bathrooms  bedrooms  sqft_above  sqft_living15  waterfront  floors  yr_renovated      sqft_lot     price  condition  sqft_lot15  sqft_living  yr_date  antiguedad_venta
16218      9302    98148      7            0.0     0        1.0       2.0       940.0         1890.0           0     1.0           0.0        6000.0  246500.0          2      8547.0        940.0   2015.0              61.0
16219     26872    98008      6            0.0     0        1.0       3.0      1270.0         1210.0           0     1.0           0.0        8000.0  475000.0          4      7875.0       1270.0   2014.0              55.0
16220     49558    98198      6            0.0     2        1.0       2.0      1170.0         1380.0           0     1.0           0.0        8925.0  175000.0          3      7440.0       1170.0   2014.0             103.0
16221       146    98117      6          120.0     0        1.0       2.0       860.0          980.0           0     1.0           0.0        2130.0  400000.0          4      2800.0        980.0   2014.0              96.0
16222      9396    98065      7            0.0     0        2.0       3.0      1950.0         2190.0           0     2.0           0.0        7263.0  409000.0          3      5900.0       1950.0   2014.0               7.0
16223     14466    98198      7            0.0     0        2.0       4.0      1780.0         1630.0           0     2.0           0.0        6000.0  175000.0          3      6000.0       1780.0   2014.0              23.0
16224     30056    98042      6            0.0     0        1.0       3.0       840.0          920.0           0     1.0           0.0        5525.0  191000.0          5      5330.0        840.0   2015.0              46.0
16225      5824    98106      7          550.0     0        2.0       3.0      1230.0         1780.0           0     1.0           0.0        6771.0  310000.0          3      6771.0       1780.0   2014.0              24.0
16226     16712    98038      7            0.0     0        2.0       3.0      1340.0         1060.0           0     2.0           0.0        3011.0  230000.0          3      3232.0       1340.0   2014.0              19.0
16227       237    98075     10            0.0     0        2.0       3.0      3240.0         2970.0           0     2.0           0.0        7857.0  800000.0          3      7857.0       3240.0   2014.0              20.0